Pat O’Reilly - Allstate
“Anyone can fit an elephant”
Regularisation
L1, L2, ElasticNet
Island with gold reserves \(\rightarrow\) Predict deposits
Never a good thing
“That would be really cool!”
“You should definitely do that”
“So how are you thinking of generating the data…”
Mine gold, make profit
Bid price, extraction costs, profit margins
Emphasise use of prediction, accuracy less important
Something for everyone
Vickrey auction: highest bidder wins, pays second-highest bid
Easy to understand, code, fast to run
Harder to ‘game’
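The second-price rule is short enough to sketch in a few lines. This is a minimal illustration, not the competition's actual code: the function name `vickrey_auction`, the `reserve` parameter, the tie-break by team name, and the example bids are all assumptions made here.

```python
def vickrey_auction(bids, reserve=0.0):
    """Sealed-bid second-price auction.

    `bids` maps a team name to its bid. The highest bidder wins and pays
    the second-highest bid. With a single bidder the price falls back to
    `reserve` (an assumption; the slides do not cover this case).
    """
    if not bids:
        return None, 0.0
    # Sort by bid, highest first; break ties by name so results are repeatable.
    ranked = sorted(bids.items(), key=lambda kv: (-kv[1], kv[0]))
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else reserve
    return winner, price


# Hypothetical bids for one parcel.
example_bids = {"team_a": 100.0, "team_b": 80.0, "team_c": 60.0}
winner, price = vickrey_auction(example_bids)
print(winner, price)  # team_a 80.0
```

Because the winner pays the runner-up's bid rather than their own, shading a bid below your own valuation cannot lower the price you pay; it can only lose you the parcel. That is the sense in which the format is harder to game.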
Starting capital
50 parcels of land, sequential auctions
\[ \text{Score} = \text{Start} - \text{Bids Paid} - \text{Extraction Costs} + \text{Sales} \]
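A worked instance of the score formula, as a small sketch. The function name `score` and every number below are made up for illustration; only the formula itself comes from the slide.

```python
def score(start, bids_paid, extraction_costs, sales):
    """Score = Start - Bids Paid - Extraction Costs + Sales."""
    return start - sum(bids_paid) - sum(extraction_costs) + sum(sales)


# Hypothetical team that won two parcels.
final = score(
    start=1_000_000,
    bids_paid=[200_000, 150_000],        # second-highest bids paid
    extraction_costs=[50_000, 40_000],   # cost of mining each parcel
    sales=[300_000, 120_000],            # revenue from the gold found
)
print(final)  # 980000
```

In this example the first parcel makes 50,000 and the second loses 70,000, so the team ends 20,000 below its starting capital. Winning an auction is not the same as making money.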
Online datasets pointless
How do I generate data?
Make it up!
Hard to reverse engineer
No ‘true’ model
2D Gaussian Processes
\[ 150 \times 150 = 22{,}500 \; \text{land parcels} \]
Tested with simple models
5,000 parcels in training set
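One way to make up a smooth 2D surface on a 150 x 150 grid is to draw a sample from a Gaussian process. The sketch below is not the generator used for the competition: the squared-exponential kernel, the length scale, the jitter, the seeds, and the names `rbf_kernel_1d` and `sample_gp_grid` are all assumptions. It relies on the kernel being separable in x and y, so only two 150 x 150 Cholesky factors are needed instead of one 22,500 x 22,500 factor.

```python
import numpy as np


def rbf_kernel_1d(coords, length_scale):
    """Squared-exponential kernel matrix along one axis."""
    d = coords[:, None] - coords[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def sample_gp_grid(n=150, length_scale=10.0, jitter=1e-6, seed=0):
    """Draw one zero-mean GP sample on an n x n grid.

    The 2D kernel factorises as k(x, x') * k(y, y'), so a sample is
    L @ Z @ L.T, where L is the Cholesky factor of the 1D kernel matrix
    and Z is an n x n matrix of independent standard normals.
    """
    rng = np.random.default_rng(seed)
    coords = np.arange(n, dtype=float)
    # Jitter on the diagonal keeps the Cholesky factorisation stable.
    K = rbf_kernel_1d(coords, length_scale) + jitter * np.eye(n)
    L = np.linalg.cholesky(K)
    Z = rng.standard_normal((n, n))
    return L @ Z @ L.T


field = sample_gp_grid()  # 150 x 150 = 22,500 parcels

# Pick 5,000 distinct parcels as a training set (flat parcel indices).
train_ids = np.random.default_rng(1).choice(field.size, size=5000, replace=False)
train_values = field.ravel()[train_ids]
print(field.shape, train_ids.shape)
```

Layering several such surfaces, for example one per observed variable plus one for the gold deposit, would be one route to data with no single 'true' model to reverse engineer.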
Team submits bids for each parcel
## parcel_id,bid_amount
## 515,734503.19
## 914,883127.20
## 1538,56130.52
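A submission in this format can be read and checked with the standard library alone. The sketch below is illustrative rather than the app's real validation: the function name `read_submission` and the specific checks (exact header, non-negative bids, no duplicate parcels) are assumptions.

```python
import csv
import io


def read_submission(text):
    """Parse a 'parcel_id,bid_amount' CSV into {parcel_id: bid_amount}."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != ["parcel_id", "bid_amount"]:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    bids = {}
    for row in reader:
        parcel_id = int(row["parcel_id"])
        amount = float(row["bid_amount"])
        if amount < 0:
            raise ValueError(f"negative bid for parcel {parcel_id}")
        if parcel_id in bids:
            raise ValueError(f"duplicate bid for parcel {parcel_id}")
        bids[parcel_id] = amount
    return bids


# The three example rows from the slide.
sample = "parcel_id,bid_amount\n515,734503.19\n914,883127.20\n1538,56130.52\n"
submission = read_submission(sample)
print(submission)
```

Rejecting a malformed file at upload time means a team finds out straight away instead of after the auctions have been run.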
Simple Shiny app
CSV upload with time stamp
Manual submissions - emails
12 submissions
Facebook, mid-September 2016
7-8 teams
Most teams single person
Speakers described approach (5-10 mins)
Clustering, Cersei Lannister, Qlik
Mini-competitions
Hospital
Unsophisticated model
Use variables to predict unknown variables to then predict gold
Focus on most profitable parcels
Overspend on auctions
I am bad at web code
It was a lot of fun
Definitely want to do it again
On that note …
I am Henry the VIIIth I am
Tonight: Expressions of Interest
December: Competition Launch
Jan/Feb: Competition Close and Results Night